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Abstract.  Software  systems  need  to  adapt  as  requirements  change,  en¬ 
vironment  conditions  vary,  and  bugs  are  discovered  and  fixed.  In  systems 
that  need  to  provide  continuous  operation,  it  is  important  that  the  adap¬ 
tation  be  done  with  minimal  interruption  in  the  execution  of  the  system. 
In  context  of  verification  of  these  adaptive  systems,  the  verification  needs 
to  be  done  for  the  system  before  adaptation,  for  the  system  during  adapta¬ 
tion,  and  for  the  system  after  adaptation.  While  the  existing  techniques  of 
program  verification  can  be  directly  applied  to  verify  the  system  before  and 
the  system  after  adaptation,  they  cannot  be  applied  directly  to  verify  the 
system  during  adaptation.  This  is  because  the  system  during  adaptation 
is  not  well-defined  as  it  consists  of  parts  of  both  the  old  system  (system 
before  adaptation)  and  the  new  system  (system  after  adaptation).  In  our 
previous  work,  we  presented  an  approach  based  on  adaptation  lattice  for 
verifying  the  correctness  of  dynamic  adaptation.  The  complexity  of  our 
approach  depends  on  the  size  (number  of  nodes  and  edges)  of  the  adapta¬ 
tion  lattice,  as  each  node  and  each  edge  in  the  lattice  needs  to  be  verified 
independently.  In  this  paper,  we  discuss  the  tradeoff  between  concurrency 
of  adaptation  and  complexity  of  verifying  that  adaptation. 


1  Introduction 

Software  systems  evolve  or  adapt  to  modify  their  functional  or  non-functional 
behavior.  This  adaptation  is  required  because  of  changes  in  requirements  and/or 
environment  conditions.  The  adaptation  may  reconfigure  system  parameters,  or 
add,  remove  or  replace  software  components.  Such  adaptation  can  be  done  either 
statically  (during  compile-time  or  load-time)  or  dynamically  (during  run-time). 
In  systems  that  are  required  to  provide  continuous  (uninterrupted)  service,  the 
adaptation  needs  to  be  done  dynamically.  In  this  paper,  we  focus  on  the  verification 
of  such  dynamic  adaptive  systems.  (For  brevity,  unless  mentioned  otherwise,  in 
this  paper,  we  use  adaptation  to  denote  dynamic  adaptation.) 

The  issues  in  adaptation  can  be  classified  into  two  broad  categories:  syntactic 
and  semantic.  Syntactic  issues  in  adaptation  deal  with  mechanisms  of  adapta¬ 
tion,  such  as,  parameter  checking,  interface  compatibility,  dynamic  loading,  and 
reflection.  Semantic  issues  address  the  correctness  aspect  of  adaptation,  such  as, 
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verifying  the  system  during  adaptation  and  the  system  after  adaptation.  Several 
approaches  (e.g.  [1-6])  addressing  adaptation  in  distributed  systems  have  been 
discussed  in  literature.  However,  these  approaches  mostly  address  the  syntactic 
issues  in  adaptation. 

Since  current  literature  does  not  sufficiently  address  the  semantic  issues  in 
adaptation,  in  [7],  we  focused  on  an  approach  for  verifying  adaptation  in  a  dis¬ 
tributed  system.  Since  such  adaptation  involves  modification  to  several  (if  not  all) 
processes  involved  in  the  system,  it  cannot  be  performed  atomically.  We,  therefore, 
introduced  the  notion  of  atomic  adaptation ;  typically  only  a  single  process  will  be 
involved  in  an  atomic  adaptation.  Thus,  the  adaptation  in  a  distributed  system 
can  be  modeled  as  a  sequence  of  several  atomic  adaptations.  It  follows  that  dur¬ 
ing  adaptation,  the  system  may  consist  of  parts  of  the  old  program  (i.e. ,  program 
before  adaptation)  and  parts  of  the  new  program  (i.e.,  program  after  adaptation). 
Hence,  to  guarantee  correctness  of  adaptation,  in  [7],  we  focus  on  the  properties 
of  such  intermediate  programs. 

To  verify  the  properties  of  an  adaptive  program,  in  [7],  we  introduced  an  adap¬ 
tation  lattice.  Specifically  this  lattice  enables  us  to  verify  properties  of  adaptation 
by  verifying  the  (safety)  properties  of  these  intermediate  programs.  Each  node 
in  the  lattice  corresponds  to  an  intermediate  program  and  needs  to  be  verified 
independently.  Further,  each  edge  in  the  lattice  also  needs  to  be  verified.  Thus, 
the  complexity  of  verification  depends  on  the  size  of  the  adaptation  lattice. 

In  this  paper,  we  investigate  approaches  for  reducing  the  size  of  the  adapta¬ 
tion  lattice  and,  thereby  reduce  the  cost  of  verification  of  adaptation.  Specifically, 
we  evaluate  the  tradeoff  between  concurrency  of  adaptation  and  the  verification 
complexity  of  that  adaptation.  We  show  that  such  a  tradeoff  that  is  previously 
considered  in  the  context  of  concurrency  and  verification  of  single  non-adaptive 
programs  —  can  be  extended  in  the  context  of  adaptive  systems.  We  illustrate 
this  tradeoff  using  the  case  study  in  which  the  adaptation  replaces  a  centralized 
mutual  exclusion  protocol  ([8])  with  a  distributed  mutual  exclusion  protocol  ([9]). 

Organization  of  the  paper.  In  Section  2,  we  discuss  how  distributed  pro¬ 
grams  and  adaptations  are  modeled.  We  describe  the  adaptation  lattice  used  in 
verification  of  adaptation  in  Section  3.  The  tradeoff  in  verifying  adaptation  is 
discussed  in  Section  4.  We  present  our  case  study  in  Section  5.  In  Section  6,  we 
discuss  some  issues  related  to  our  work,  and  finally  conclude  in  Section  7. 


2  Modeling  Adaptation 

In  this  section,  we  describe  how  we  model  the  adaptive  programs  in  order  to  show 
their  correctness.  This  modeling  is  based  on  our  previous  work  in  [7].  We  con¬ 
sider  adaptations  where  the  program  needs  to  add,  remove  or  replace  distributed 
protocols.  A  protocol  implements  a  part  of  the  desired  behavior  of  the  system.  It 
consists  of  one  of  more  fractions  [1] .  Each  fraction  is  associated  with  one  process 
of  the  program.  For  example,  consider  a  protocol  that  provides  secure  communi¬ 
cation  between  a  sender  and  a  receiver.  Such  a  protocol  consists  of  two  fractions, 
namely,  encryption  fraction  at  the  sender  that  encrypts  the  packets  before  send¬ 
ing  and  decryption  fraction  at  the  receiver  that  decrypts  the  encrypted  packets 
received  from  the  sender.  For  simplicity,  we  consider  only  one  fraction  at  each  pro- 


cess,  although,  our  approach  can  be  used  even  if  multiple  fractions  are  associated 
at  a  process. 

Atomic  Adaptation.  To  add  a  protocol  to  a  distributed  program,  each  frac¬ 
tion  of  the  protocol  needs  to  be  added  at  processes  of  the  program.  Similarly,  to 
remove  a  protocol,  each  fraction  of  the  component  needs  to  be  removed  from  the 
corresponding  process  of  the  program.  Thus,  an  adaptation  in  distributed  system 
involves  changes  at  various  processes  of  the  system.  In  other  words,  adaptation 
in  distributed  programs  involves  multiple  steps.  The  key  to  verifying  adaptation 
in  distributed  programs  is  identifying  individual  atomic  steps  in  adaptation.  We 
divide  this  multi-step  adaptation  into  multiple  atomic  adaptations  each  occur¬ 
ring  at  only  one  process.  Thus,  an  adaptation  consists  of  a  sequence  of  atomic 
adaptations.  Each  atomic  adaptation  a  special  type  of  action  (defined  later  in 
this  section)  and  is  represented  by  a  name  and  has  a  guard  (defined  later  in  this 
section).  An  atomic  adaptation  is  performed  when  the  guard  corresponding  to  it 
becomes  true. 

Intermediate  Program.  We  define  a  system  before  adaptation  as  old  pro¬ 
gram  and  a  system  after  adaptation  as  new  program.  Each  atomic  adaptation 
modifies  the  system  into  an  intermediate  program.  The  first  atomic  adaptation 
modifies  the  old  program  into  the  first  intermediate  program.  Similarly,  other 
atomic  adaptations  modifies  one  intermediate  program  into  the  next  intermediate 
program.  The  last  atomic  adaptation  results  into  the  new  program. 

In  the  case  study  discussed  in  Section  5,  we  consider  adaptation  that  performs 
replacement  of  protocols.  Addition  and  removal  of  protocols  are  considered  as 
special  case  of  replacement  (cf.  Section  6). 

Now,  we  discuss  how  we  model  such  distributed  programs  and  the  multi-step 
adaptation  performed  in  them.  We  note  that  general  purpose  languages  such  as 
C+- (-/Java  that  are  used  in  existing  adaptation  techniques  (e.g.,  [1-4, 10])  are 
not  suitable  for  our  task  as  verification  of  programs  in  these  languages  is  difficult 
even  in  the  absence  of  adaptation.  For  this  reason,  while  proving  correctness  of 
programs,  we  often  consider  their  abstract  version  where  several  implementation 
details  are  omitted.  We  use  a  variation  of  the  guarded  commands  [11]  for  our 
purpose.  We  note  that  there  are  techniques  [12]  that  allow  one  to  obtain  a  guarded 
command  representation  from  a  program  in  general  purpose  language  such  as  C. 
Also,  there  are  techniques  [13,14]  that  allow  one  to  transform  a  program  in  guarded 
commands  into  a  program  in  general  purpose  languages.  We  describe  the  modeling 
of  the  distributed  programs  and  their  adaptation,  next. 


2.1  Modeling  Adaptive  Program 

Program  and  Process.  A  program  V  is  specified  by  a  set  of  global  constants,  a 
set  of  global  variables  and  a  finite  set  of  processes.  A  process  p  is  specified  by  a  set 
of  local  constants,  a  set  of  local  variables  and  a  finite  set  of  actions  (defined  later  in 
this  section) .  Variables  declared  in  V  can  be  read  and  written  by  the  actions  of  all 
processes.  The  processes  in  a  program  communicate  with  one  another  by  sending 
and  receiving  messages  over  unbounded  channels  that  connect  the  processes.  We 
use  the  notation  ch.p.q  to  denote  the  content  of  the  channel  from  process  p  to 
process  q ,  and  to  denote  the  number  of  messages  in  the  channel  from  p 

to  q. 


Protocol  and  its  Fractions.  A  protocol  is  a  set  of  global  constants,  a  set 
of  global  variables,  and  a  finite  set  of  fractions  that  are  involved  in  providing  a 
common  functionality.  A  fraction  is  a  set  of  local  constants,  a  set  of  local  variables, 
and  a  finite  set  of  actions  that  are  associated  with  a  single  process.  A  protocol  and 
a  fraction  also  has  input  parameters,  which  are  values  of  the  variables  supplied  by 
the  program;  and  output  parameters,  which  are  values  returned  to  the  program. 
State  and  State  Predicate.  A  state  of  process  p  is  defined  by  a  value  for  each 
variable  of  p,  chosen  from  the  predefined  domain  of  the  variable.  A  state  of  program 
V  is  defined  by  a  value  for  each  global  variable  of  V.  the  state  of  all  its  processes 
and  the  contents  of  all  channels.  A  state  predicate  of  p  is  a  boolean  expression  over 
the  constants  and  variables  of  p.  A  state  predicate  of  V  is  a  boolean  expression  over 
constants  and  variables  of  all  processes,  all  global  constants  and  global  variables, 
and  the  contents  of  all  channels.  A  state  and  a  state  predicate  of  a  protocol  and 
a  fraction  are  defined  accordingly. 

Action.  An  action  of  p  is  uniquely  identified  by  a  name,  and  is  of  the  form 
{name)  :  {guard)  — ►  ( statement ) 

A  guard  is  a  combination  of  one  or  more  of  the  following:  a  state  predicate  of  p ,  a 
receiving  guard  of  p ,  or  a  timeout  guard.  A  receiving  guard  of  p  is  of  the  form  rev 
( message )  from  {q).  A  timeout  guard  is  of  the  form  timeout  {state  predicate  of  V). 
The  statement  of  an  action  updates  zero  or  more  variables  and/or  sends  one  or 
more  messages.  An  action  can  be  executed  only  if  its  guard  evaluates  to  true.  To 
execute  an  action,  the  statement  of  that  action  is  executed  atomically.  A  sending 
statement  of  p  is  of  the  form  send  ( message )  to  {qi, ...,  qn),  which  sends  a  message 
to  one  or  more  processes  .  We  say  that  an  action  of  p  is  enabled  in  a  state  of  V  iff 
its  guard  evaluates  to  true  in  that  state. 

Computation.  A  computation  of  V  is  a  sequence  of  states  so,  si,  such  that  for 
each  j,  j  >  0,  Sj  is  obtained  from  state  by  executing  an  action  of  V  that  is  en¬ 
abled  in  the  state  Sj_i,  and  satisfies  the  following  conditions:  (i)  Non- determinism: 
Any  action  that  is  enabled  in  a  state  of  V  can  be  selected  for  execution  at  that 
state,  {ii)  Atomicity:  Enabled  actions  in  V  are  executed  one  at  a  time,  (in)  Fair¬ 
ness:  If  an  action  is  continuously  enabled  along  the  states  in  the  sequence,  then 
that  action  is  eventually  chosen  for  execution,  and  (iv)  Maximality:  Maximality 
of  the  sequence  means  that  if  the  sequence  is  finite  then  the  guard  of  each  action 
in  V  is  false  in  the  final  state. 

Closure.  Let  S'  be  a  state  predicate.  An  action  ac  of  p  preserves  S  iff  in  any  state 
where  S  is  true  and  ac  is  enabled,  atomically  executing  the  statement  of  ac  yields 
a  state  where  S  continues  to  be  true.  S  is  closed  in  a  set  of  actions  iff  each  action 
in  that  set  preserves  S. 

Specification.  The  program  specification  is  a  set  of  acceptable  computations.  Fol¬ 
lowing  Alpern  and  Schneider  [15],  specification  can  be  decomposed  into  a  safety 
specification  and  a  liveness  specification.  Also,  as  shown  in  [16],  for  a  rich  class  of 
specifications,  safety  specification  can  be  represented  as  a  set  of  bad  transitions 
that  should  not  occur  in  program  computations.  Hence,  we  represent  this  safety 
specification  by  a  set  of  bad  transitions  that  should  not  occur  in  program  com¬ 
putations.  We  omit  the  representation  of  liveness  specification  here  as  it  is  not 
required  for  our  purpose. 

Satisfies.  V  satisfies  specification  from  S  iff  each  computation  of  V  that  starts 
from  a  state  where  S  is  true  is  in  the  specification. 


Invariant.  The  state  predicate  S  of  V  is  an  invariant  iff  (i)  S  is  closed  in  V,  and 
(■ ii )  V  satisfies  specification  from  S. 

Specification  during  adaptation.  We  argue  that  while  the  specification  before 
and  after  adaptation  can  be  arbitrary,  the  specification  during  adaptation  should 
be  a  safety  specification.  This  is  due  to  the  fact  that  one  often  wants  the  adaptation 
to  be  completed  as  quickly  as  possible.  Hence,  it  is  desirable  not  to  delay  the 
adaptation  task  to  satisfy  the  liveness  specification  during  adaptation.  Rather,  it 
is  desirable  to  guarantee  that,  after  adaptation,  the  program  reaches  states  from 
where  its  (new)  safety  and  liveness  specification  is  satisfied.  Thus,  the  implicit 
liveness  specification  during  adaptation  is  that  the  adaptation  completes. 
Notation.  We  use  the  notation  X  Y  Y  Y  Z  to  imply  that  only  one  of  X,  Y,  or  Z 
is  true  at  a  time. 

3  Verifying  Adaptation 

In  this  section,  we  recall  the  relevant  definitions  and  results  from  [7].  These  defini¬ 
tions  include  adaptation  lattice,  transitional-invariants  and  transitional-invariant 
lattice. 

Adaptation  lattice.  An  adaptation  lattice  is  a  finite  directed  acyclic  graph  in 
which  each  node  is  labeled  with  one  or  more  predicates  and  each  edge  is  labeled 
with  an  atomic  adaptation,  such  that, 

1.  There  is  a  single  entry  node  P  having  no  incoming  edges.  The  entry  node 
is  associated  with  predicates  over  the  variables  of  the  old  program,  i.e. ,  the 
program  before  adaptation.  The  entry  node  is  also  called  a  start  node. 

2.  There  is  a  single  exit  node  Q  having  no  outgoing  edges.  The  exit  node  is 
associated  with  predicates  over  the  variables  of  the  new  program,  i.e.,  the 
program  after  adaptation.  The  exit  node  is  also  called  an  end  node. 

3.  Each  intermediate  node  R  that  has  at  least  one  incoming  and  outgoing  edge 
is  associated  with  predicates  over  the  variables  of  both  the  old  and  the  new 
program. 

In  the  context  of  adaptation,  the  program  adds  and  removes  components,  and 
thus,  the  program  during  adaptation  consists  of  actions  of  the  old  program  and  the 
new  program.  Therefore,  we  consider  intermediate  programs  obtained  after  one  or 
more  atomic  adaptations.  Similar  to  the  invariants  that  are  used  to  identify  “legal” 
program  states  and  are  closed  under  program  execution,  we  define  transitional- 
invariants. 

Transitional-invariant.  A  transitional-invariant  is  a  predicate  that  is  true 
throughout  the  execution  of  an  intermediate  program  and  is  closed  under  the 
old  program  actions  that  are  not  yet  removed  and  the  new  program  actions  that 
are  already  added.  However,  the  atomic  adaptations  do  not  necessarily  preserve 
the  transitional-invariant.  Now,  we  define  transitional-invariant  lattice. 
Transitional-invariant  lattice.  A  transitional-invariant  lattice  is  an  adaptation 
lattice  with  each  node  having  one  predicate  and  that  satisfies  the  following  five 
conditions  (see  Fig.  1  for  an  example): 

1.  The  entry  node  P  is  associated  with  an  invariant  Sp  of  the  program  before 
adaptation. 

2.  The  exit  node  Q  is  associated  with  an  invariant  Sq  of  the  program  after 
adaptation. 


3.  Each  intermediate  node  R  is  associated  with  a  transitional- invariant  TS } j, 
such  that  any  intermediate  program  at  R  (i.e. ,  intermediate  program  obtained 
by  performing  adaptations  from  the  entry  node  to  R)  satisfies  the  (safety) 
specification  during  adaptation  from  TS  r. 

4.  If  a  node  labeled  Ri  has  an  outgoing  edge  labeled  A  to  a  node  labeled  Rj ,  then 
performing  atomic  adaptation  A  in  any  state  where  TS  Rt  holds  and  guard  of 
A  is  true  results  in  a  state  where  TS  Rj  holds,  and  the  transition  obtained  by 
A  satisfies  the  safety  specification  during  adaptation. 

5.  If  a  node  labeled  R  has  outgoing  edges  labeled  ai,a2,  ...,cifc  to  nodes  labeled 
R,\ ,  /?2, ....  -Rfc;  respectively,  then  in  any  execution  of  any  intermediate  program 
at  R  that  starts  in  a  state  where  TS r  is  true,  eventually,  the  guard  of  at  least 
one  atomic  adaptation  a*,  1  <  i  <  k,  becomes  true  and  remains  true  thereafter 
and,  hence,  eventually  some  aq  will  be  performed. 

Remark.  An  intermediate  node  R  could  be  reached  by  multiple  paths.  Therefore, 
it  is  required  that  TS r  be  met  for  each  intermediate  program  corresponding  to 
the  path. 

Now,  to  prove  the  correctness  of  adaptation,  we  need  to  find  a  transitional- 
invariant  lattice  corresponding  to  the  adaptation.  This  is  stated  formally  by  the 
following  theorem:  (We  refer  readers  to  [17]  for  the  proof.) 


Fig.  1.  An  example  of  a  transitional-invariant  lattice 


Theorem.  Given  Sp  as  the  invariant  of  the  program  before  adaptation  and 
Sq  as  the  invariant  of  the  program  after  adaptation,  if  there  is  a  transitional- 
invariant  lattice  for  an  adaptation  with  entry  node  associated  with  Sp  and  exit 
node  associated  with  Sq,  then  the  adaptation  depicted  by  that  lattice  is  correct 
(i.e.,  while  the  adaptation  is  being  performed  the  specification  during  adaptation 
is  satisfied  and  after  the  adaptation  completes,  the  application  satisfies  the  new 
specification).  □ 


4  Concurrency  v/s  Complexity  of  Verification 

In  an  adaptation,  the  atomic  adaptations  occurring  at  different  processes  may 
occur  in  an  asynchronous  manner.  Therefore,  to  verify  a  given  adaptation,  we 
need  to  consider  all  possible  orderings  of  concurrent  atomic  adaptations.  As  a 
result,  in  the  adaptation  lattice,  we  have  multiple  paths  from  start  node  to  end 
node  to  encompass  all  possible  orderings  among  concurrent  atomic  adaptations. 
Each  path  from  start  node  to  end  node  in  the  lattice  denotes  a  possible  adaptation 


from  an  old  program  to  a  new  program.  We  call  a  path  in  the  lattice  from  the 
start  node  to  the  end  node  as  an  adaptation  path. 

Putting  concurrent  atomic  adaptations  in  various  possible  orderings  is  a  po¬ 
tential  cause  of  the  explosion  in  size  of  the  adaptation  lattice.  For  example,  if  an 
adaptation  consists  of  n  atomic  adaptations  that  can  be  executed  concurrently, 
then  there  are  n!  different  orderings  and  2"  —  2  different  intermediate  programs. 
Thus,  2"  —  2  transitional-invariants  need  to  be  identified  corresponding  to  each 
intermediate  program.  The  adaptation  lattice  in  this  case  is  as  shown  in  Fig.  2(a) 
for  n  =  3.  To  identify  all  these  transitional- invariants  and  verify  the  corresponding 
intermediate  programs  is  a  difficult  process. 


Fig.  2.  Executing  three  atomic  adaptations 


Clearly,  the  specification  during  adaptation  is  satisfied,  if  the  adaptation  fol¬ 
lows  any  path  in  the  lattice.  So  instead  of  choosing  to  verify  all  adaptation  paths, 
if  we  verify  only  one  path  (e.g.  ai,  <23,  <22),  then  the  lattice  would  be  as  shown  in 
Fig.  2(b).  For  specification  during  adaptation  to  be  satisfied  the  adaptation  must 
follow  this  path,  i.e. ,  ai  lias  to  occur  before  03,  and  <23  has  to  occur  before  02-  In 
this  case  there  is  no  concurrency  of  adaptation,  and  we  are  able  to  reduce  the  cost 
of  verification  from  0(2")  to  O(n).  Specifically,  for  n  concurrent  atomic  adapta¬ 
tions,  the  number  of  transitional-invariants  that  needs  to  be  identified  is  reduced 
to  n  —  1. 

Alternatively,  we  could  have  chosen  a  lattice  as  shown  in  Fig.  2(c).  In  this 
case  the  cost  of  verification  is  more  compared  to  the  lattice  in  Fig.  2(b),  but  less 
compared  to  the  lattice  in  Fig.  2(a).  Also,  the  concurrency  in  the  lattice  in  Fig. 
2(c)  is  more  compared  to  the  lattice  in  Fig.  2(b),  but  less  when  compared  to  the 
lattice  in  Fig.  2(a). 

Thus,  based  on  the  tradeoff  between  concurrency  of  adaptation  and  complexity 
of  verifying  that  adaptation,  we  can  choose  a  subgraph  (sublattice)  of  a  given 
lattice  that  has  all  the  properties  of  the  adaptation  lattice  as  defined  in  Section 
3.  The  adaptation  in  this  case  has  to  be  constrained  so  that  it  follows  some  path 
in  the  sublattice. 


5  Case  Study 

In  this  section,  we  illustrate  the  tradeoff  between  concurrency  of  adaptation  and 
complexity  of  verifying  that  adaptation  using  the  replacement  of  mutual  exclusion 
protocols.  We  describe  adaptation  from  a  centralized  mutual  exclusion  protocol 
to  a  distributed  mutual  exclusion  protocol,  and  demonstrate  how  the  ordering 
among  atomic  adaptations  reduces  the  size  of  the  adaptation  lattice.  We  choose 
these  two  extreme  solutions  of  mutual  exclusion  protocols,  because  it  is  easy  to 
explain  the  tradeoff  in  this  context. 

We  use  the  component  family  design  [18]  to  model  these  two  protocols.  The 
component  family  design  focuses  on  building  a  family  of  components  involved 
in  providing  similar  functionality.  For  separating  the  adaptation  concern,  each 
component  in  a  family  consists  of  two  parts  (i)  a  functional  part ,  and  (ii)  an 
adapt- active  part.  The  adapt-active  part  consists  of  actions  that  are  needed  only 
during  adaptation. 

In  this  case  study,  we  build  a  family  of  mutual  exclusion  protocols.  We  con¬ 
sider  two  protocols  in  the  family,  a  centralized  mutual  exclusion  protocol  and  a 
distributed  mutual  exclusion  protocol.  We  consider  mutual  exclusion  by  Felten 
and  Rabinovich  [8]  for  centralized  mutual  exclusion  protocol,  and  by  Ricart  and 
Agrawala  [9]  for  distributed  mutual  exclusion  protocol. 

In  the  following  subsections,  we  describe  the  details  of  these  protocols.  These 
protocols  are  installed  on  a  distributed  program  consisting  of  N  processes.  Each 
process  is  denoted  by  i  (1  <  i  <  N).  Each  process  i  has  two  output  variables 
tryi  and  reseU,  which  are  inputs  to  the  mutual  exclusion  protocol;  and  two  input 
variables  criti  and  re?Tij,  which  are  outputs  of  the  mutual  exclusion  protocol.  When 
a  process  wants  access  to  the  critical  section,  it  sets  tryi  to  true.  The  process  can 
access  the  critical  section  only  when  criti  is  set  to  true.  When  it  has  finished 
accessing  the  critical  section  it  sets  reset-i  to  true.  When  a  process  is  not  accessing 
or  requesting  the  critical  section,  the  mutual  exclusion  protocol  sets  renii  to  true. 

5.1  Centralized  Mutual  Exclusion  Protocol 

In  centralized  mutual  exclusion  protocol,  one  of  the  node  acts  as  a  master  node. 
The  role  of  the  master  node  is  to  assign  the  permission  to  the  critical  section  to 
the  requesting  nodes.  If  a  client  needs  to  access  the  critical  section,  it  requests  the 
token  from  the  master.  The  master  maintains  a  list  of  nodes  that  have  requested 
the  token.  It  grants  the  token  to  the  first  node  in  the  queue.  A  client  node  after 
finishing  its  critical  section  informs  the  master;  whereupon  the  master  will  remove 
that  client  from  the  list,  and  grant  the  token  to  the  next  node  in  the  list. 

The  centralized  mutual  exclusion  protocol  is  shown  in  Fig.  3.  Each  fraction 
CentralME./j  of  the  protocol  is  installed  at  the  corresponding  process  i  of  the 
distributed  program.  The  master  fraction  is  denoted  by  CentralME.m  (m  =  /,  for 
some  *).  A  client  requests  the  token  by  sending  REQUEST  message  to  the  master. 
The  master  maintains  two  data  structures:  the  activeQ  contains  a  list  of  nodes 
that  have  requested  the  token,  and  which  will  receive  the  token  without  further 
action  by  the  master;  the  passiveQ  is  a  list  of  nodes  that  have  requested  the  token, 
but  whose  requests  will  not  be  satisfied  until  the  master  sends  further  messages. 
A  client  node  receives  the  token  by  the  arrival  of  the  GRANT  message.  The  GRANT 
message  contains  a  queue  of  nodes  that  have  requested  the  token.  When  a  client 


Component  CentralME 
const  minLen ,  m 


Fraction  CentralME.  fi 


input  tryi,  reseti 

:  bool 

output  criti 

:  bool 

{initially  false} 

remi 

:  bool 

{initially  true} 

var  afterMe ,  forward  Q 

:  list 

{initially  (/>} 

lastbatch 

:  int 

{initially  0} 

critReq,  sendingReq 

:  bool 

{initially  false} 

activeQ,  passiveQ 

:  list 

{initially  <p} 

batch 

:  int 

{initially  0} 

rcvReq  :  / i  =  m  A  rev  REQUEST  from  fj  — »  passiveQ  :=  passiveQ  o  j ; 
rcvDone  :  / i  =  m  A  rev  DONE  from  fj  —*■  activeQ  removeMatchedItem(  activeQ,  j)] 

Grant  :  fi  =  m  A  activeQ  =  <j>  A  passiveQ  ^  0  — ►  activeQ  :=  passiveQ ; 

passiveQ  </>; 

send  GRANT  (batch,  activeQ )  to  head  (activeQ); 
batch  batch  +  1; 


Forward  :  fi  =  m  A  length( activeQ)  <  minLen  A  passiveQ  ^  (f>  — * 

send  FORWARD  (&a£c/i,  passiveQ)  to  tail  (activeQ); 

batch  batch  +  1; 

activeQ  :=  activeQ  o  passiveQ ; 

passiveQ  :=  </>; 


start_try  :  tryi  A  — »  critReq  — *  critReq,  sendingReq  :=  true; 

remi  :=  false; 


sndReq  :  critReq  A  sendingReq  — »  send  REQUEST  to  m; 

sendingReq  false; 

allowCrit  :  rev  GRANT(6num,  Q)  from  m  V  rev  GRANT(6rmm,  Q)  from  fj  — » 

criti  true; 

critReq  :=  false; 

lastbatch  bnum ; 

afterMe  :=removeFirstItem(Q); 


exit  :  criti  A  reseti  — »  criti  :=  false; 

remi  true; 
send  DONE  to  m; 
if  forwardQ  ^ 

afterMe  :=  afterMe  o  forwardQ; 
forwardQ  :=</>; 
lastbatch  :=  lastbatch  +  1; 
if  afterMe  ^  f> 

send  GRANT(Zast6atcti,  afterMe)  to  head  (afterMe)] 
rcvFwd  :  rev  F0RWARD(6num,  Q)  from  m  if  bnum  =  lastbatch  +  1  A  —>criti 

send  GRANT(fermra,  Q)  to  head(Q); 
elseif  bnum  >  lastbatch 
forwardQ  :=  Q; 


Fig.  3.  Centralized  Mutual  Exclusion  Protocol  [8] 


receives  a  GRANT  message,  it  removes  its  own  ID  from  the  head  of  the  queue,  and 
after  finishing  its  critical  section,  it  sends  the  GRANT  message  to  the  first  node 
in  the  queue.  A  client  node  after  finishing  its  critical  section  also  sends  a  DONE 
message  to  the  master.  A  master  removes  the  ID  of  the  node  from  its  activeQ 
after  it  receives  DONE  from  that  node. 

The  invariant  of  the  centralized  mutual  exclusion  protocol  is  as  shown  in  Fig.  4. 
Informally,  the  invariant  states  that  the  execution  of  CentralME  protocol  satisfies 
the  following  conditions:  (?)  if  a  process  is  in  the  master’s  activeQ  list,  then  either 
it  is  accessing  the  critical  section  or  its  request  is  still  not  granted,  (ii)  if  a  process 
is  in  the  master’s  passiveQ  list,  then  its  request  to  access  the  critical  section  is  still 
not  granted,  and  (Hi)  if  a  process  is  accessing  the  critical  section,  then  all  processes 
who  requested  critical  section  before  it  have  finished  accessing  the  critical  section, 


Ti  AT2  AT3,  where 


2"i  =  Vi  :  fi  E  m.activeQ  =4*  fi.critReq  Y  criti 

T2  =  Vi  :  fi  6  m.passiveQ  =>-  fi.critReq  A  — 1  criti 

T3  =  Vi  :  criti  =r*  /i  G  m.activeQ  A 

Vj  :  fj  G  precedeList(m.  activeQ,  fi)  ::  #ch.fj  .m(DONE)  =  1  A 
Vfc  :  fk  £  succeedList (m. passiveQ,  fk)  ::  (fk  -critReq  A  -» critk) 

Fig.  4.  Invariant  of  CentralME 


and  all  processes  who  requested  critical  section  after  it  are  still  waiting  for  access 
to  critical  section. 

5.2  Distributed  Mutual  Exclusion  Protocol 

In  distributed  mutual  exclusion  protocol,  the  nodes  communicate  with  each  other 
to  arrive  at  a  consensus  on  what  process  will  get  access  to  the  critical  section.  There 
are  several  distributed  mutual  exclusion  protocols;  most  of  them  can  be  classified 
into  two  categories:  quorum-based  and  token-based  protocols.  In  quorum-based 
protocols,  each  node  maintains  its  own  local  lock  that  can  be  given  to  one  process 
at  a  time.  In  order  to  attain  a  mutual  exclusion,  a  process  must  obtain  local  locks 
on  a  set  of  nodes  called  a  quorum.  Quorums  are  defined  in  such  a  way  that  any 
two  of  them  have  at  least  one  node  in  common,  which  ensures  at  most  one  process 
can  succeed  in  obtaining  local  locks  from  a  quorum.  In  token-based  protocols, 
the  mutual  exclusion  privilege  is  associated  with  the  possession  of  a  token.  The 
protocols  ensure  that  there  is  always  only  one  valid  token  in  the  system,  and  hence, 
mutual  exclusion  is  enforced.  We  consider  the  Ricart-Agrawala  protocol  [9]  for 
achieving  mutual  exclusion  in  a  network,  which  is  one  of  the  well-known  protocols 
in  distributed  computing. 

The  distributed  mutual  exclusion  protocol  is  shown  in  Fig.  5.  A  node  making 
an  attempt  to  invoke  mutual  exclusion  sends  a  REQUEST  message  to  all  other 
nodes.  Upon  receipt  of  the  REQUEST  message,  the  other  node  either  sends  a  REPLY 
immediately  or  defers  a  response  until  after  it  leaves  its  own  critical  section.  A 
REPLY  message  is  returned  immediately  if  the  originator  of  the  REQUEST  message 
has  priority;  otherwise,  the  REPLY  is  delayed.  The  priority  order  decision  is  made 
by  comparing  a  sequence  number  present  in  each  REQUEST  message.  If  the  sequence 
numbers  are  equal,  the  node  numbers  are  compared  to  determine  which  node  will 
get  access  to  critical  section  first.  A  node  can  access  the  critical  section  after  all 
other  nodes  have  sent  their  REPLY  messages. 

The  invariant  of  the  distributed  mutual  exclusion  protocol  is  as  shown  in  Fig.  6. 
Informally,  the  invariant  states  that  the  execution  of  RicartAgrawalaME  satisfies 
the  following  conditions:  (?)  if  fj  has  received  a  request  from  fi,  then  the  highest 
sequence  number  ( highestseqno )  of  fj  is  greater  than  or  equal  to  the  sequence 
number  ( seqno )  of  fi,  and  (ii)  if  fj  has  received  a  request  from  fi,  it  will  grant 
its  permission  to  f,  if  fj  itself  is  not  requesting  for  access  to  the  critical  section, 
or  the  sequence  number  of  /,  is  less  than  fj’ s  sequence  number,  or  if  the  sequence 
numbers  are  same,  then  i  <  j. 

Notation.  Instead  of  CentralME. fi  (respectively,  RicartAgrawalaME./j),  we  use 
fi  if  the  component  CentralME  (respectively,  RicartAgrawalaME)  is  obvious  from 


Component  RicartAgrawalaME 
const  N 


Fraction  RicartAgrawalaME. fi 


input  tryi,reseti 
output  criti 
rerrii 
var  seqno 

highestseqno 

replycnt 

critReq,  sendingReq 
def Reply 


bool 

bool  {initially  false} 

bool  {initially  true} 

int  {initially  arbitrary} 

int  {initially  0} 

int  {initially  arbitrary} 

bool  {initially  false} 

array  [1  :  N]  of  bool 

{initially  false} 


start_try  :  tryi  A  — «  critReq  — ►  critReq ,  sendingReq  true; 

rerrii  :=  false; 

sndReq  :  critReq  A  sendingReq  —*■  seqno  highestseqno  +  1; 

replycnt  :=  N  —  1; 
for  p  =  1  to  N 
if  i  p 

send  REQUEST(seqno)  to  fp\ 
sendingReq  :  —  false; 

rcvRep  :  rev  REPLY  from  fj  — >  replycnt  :=  replycnt  —  1; 
allowCrit  :  critReq  A  replycnt  =  0  — »  criti  :=  true; 

critReq  false; 

exit  :  criti  A  reseti  — *  criti  false; 

rerrii  true; 
for  j  =  1  to  N 
if  defReply[j] 

defReply[j]  false; 
send  REPLY  to  fj ; 

rcvReqt  :  rev  REQUEST(k)  from  fj  — ►  if  highestseqno  <  k 

highestseqno  k\ 

defReply[j]  :=  critReq  A  (seqno,  i)  -<  ( k,j)\ 
if  ~‘defReply[j] 
send  REPLY  to  fj ; 


Fig.  5.  RicartAgrawala  Mutual  Exclusion  Protocol  [9] 


Si  A  S2,  where 

51  =  request _received(/i ,  fj)  fi. seqno  <  fj.hsn 

52  =  granted  (fi,fj)  —'fj. critReq  V  (fi.  seqno,  i)  -<  (fj  .seqno,  j) 

granted  (fi,  fj)  —  request_received  (fi,  fj)  A  —>fj.defRep[i\ 

request_received(/i ,  fj)  =  requested(/i,  fj)  A  ch.fi.  f j  (REQUEST)  =  0 

requested (fi,fj)  —  fi  ^  fj  A  (fi. (rev  REPLY  from  fj)  V 

(fi. critReq  A  —'fi. sendingReq)  V  (fi. critReq  A  fi. replycnt  =  0)) 

Fig.  6.  Invariant  of  RicartAgrawalaME 


the  context.  We  refer  the  variable  var  of  fraction  /-,  as  fi.var ,  and  of  process  i  as 
vari. 


5.3  Adaptation 

Centralized  mutual  exclusion  protocols  requires  less  message  passing  between 
nodes,  but  is  vulnerable  to  load  on  the  master.  When  the  master  node  is  un¬ 
der  heavy  load,  the  system  should  dynamically  adapt  to  a  distributed  mutual 
exclusion  protocol.  We  consider  the  adaptation  from  the  centralized  mutual  ex- 


elusion  protocol  (cf.  Fig.  3)  to  the  distributed  mutual  exclusion  protocol  (cf.  Fig. 
5). 

One  way  to  do  this  adaptation  would  be  to  initially  stop  all  the  processes  from 
requesting  access  to  critical  section.  After  all  processes  that  requested  access  to 
the  critical  section  have  finished  accessing  the  critical  section,  the  corresponding 
fractions  can  be  changed.  The  processes  can  start  requesting  access  to  critical  sec¬ 
tion  only  when  the  adaptation  (protocol  replacement)  is  complete.  This  approach 
is  similar  to  doing  the  adaptation  statically,  i.e. ,  to  stop  the  application  and  re¬ 
compile  (or  reload)  the  application  with  the  new  protocol  and  then  restart  the 
application.  This  approach  of  doing  adaptation  leads  to  unnecessary  interruption 
and  blocking. 

Another  way  of  doing  adaptation  is  to  do  the  protocol  replacement  without 
blocking  (stopping)  the  entire  application.  It  would  be  desirable  to  let  processes 
continue  requesting  access  to  critical  section  while  the  protocols  are  being  replaced. 
The  adaptation  is  done  as  shown  in  Fig.  7,  which  describes  the  adapt-active  parts 
of  each  fractions  that  are  involved  in  the  adaptation.  The  adapt-active  parts  of  the 
fractions  contain  the  atomic  adaptations  that  are  associated  with  them.  The  name 
of  each  atomic  adaptation  has  two  subscripts  (e.g.,  au)-  The  first  subscript  denotes 
the  order  in  the  sequence  of  atomic  adaptation.  Since  all  atomic  adaptations  has 
the  same  first  subscript,  it  means  that  they  can  be  executed  concurrently  (in  any 
order).  The  second  subscript  denotes  the  process  that  the  atomic  adaptation  is 
associated  with.  At  each  process  i,  there  are  two  atomic  adaptations  that  have 
same  name  an,  one  associated  with  CentralME./,;  fraction  and  other  associated 
with  RicartAgrawalaME./^  fraction.  This  implies  that  removal  of  CentralME. /* 
and  addition  of  RicartAgrawalaME. /^  needs  to  be  done  in  an  atomic  manner. 


Adapt-active  parts 

Fraction  :  CentralME. fi 
a-i i  :  —<criti  — >  remove  fi ; 

Fraction  :  RicartAgrawalaME. /* 
a-i i  :  an  (CentralME. /j)  — »  add  fi; 

Fig.  7.  Adaptation  from  CentralME  to  RicartAgrawalaME 


In  this  adaptation,  the  fraction  at  a  process  is  replaced  if  that  process  is  not 
accessing  the  critical  section.  During  adaptation,  some  process  will  have  the  old 
fractions  running  while  some  will  have  the  new  fractions  running.  If  the  new  (Ri¬ 
cartAgrawalaME)  fraction  receives  any  message  (REQUEST,  DONE,  GRANT)  from  the 
old  (CentralME)  fraction,  then  that  message  is  discarded.  Any  message  (REQUEST) 
sent  by  the  new  (RicartAgrawalaME)  fraction  to  the  process  still  using  the  old 
(CentralME)  fraction  is  buffered,  and  is  made  available  to  the  new  fraction  once 
it  replaces  the  old  fraction  at  that  process.  In  this  adaptation,  the  process  with 
new  fraction  can  start  requesting  access  to  critical  section  before  the  adaptation 
completes.  However,  that  process  will  be  granted  access  to  critical  section  only 
when  all  processes  have  completed  their  atomic  adaptation.  Further,  the  system 
does  not  have  to  wait  for  all  processes  to  be  out  of  critical  section  before  initiating 
adaptation. 


T,  a  T2 


“> crit1 


~'crit1  a  ~>crit2 

&13 

Si  A  S2 

(b) 

Fig.  8.  Transitional-invariant  lattices 

The  transitional-invariant  lattice  in  this  case  consists  of  2N  —  2  transitional- 
invariants,  and  N\  orderings  as  atomic  adaptations  are  interleaved  in  all  possible 
ways.  To  verify  the  adaptation,  each  node  and  each  edge  in  the  lattice  needs  to 
be  verified  independently.  The  complexity  of  the  verification  grows  exponentially 
with  the  number  of  processes.  The  transitional  invariant  lattice  for  the  adaptation 
of  Fig.  7  for  N  =  3  is  shown  in  Fig.  8(a).  To  simplify  the  verification,  we  model 
the  adaptation  as  discussed  next. 

Constrained  Adaptation.  To  simplify  verification,  we  model  the  adapta¬ 
tion  by  introducing  constraints  on  the  atomic  adaptations.  We  introduce  priority 
among  the  processes  such  that  a  process  having  a  lower  ID  executes  an  atomic 
adaptation  before  a  process  having  a  higher  ID.  This  is  shown  in  Fig.  9.  In  this 
case,  as  shown  in  Fig.  8(b),  there  is  only  one  ordering  and  n  —  1  transitional- 
invariants. 


T1  a  T2 


-'criti  a  ~'crit2  ~,crit1  a  ~'crit3  ->crit2  a  ~'crit3 


Si  a  S2 


(a) 


Adapt-active  parts 

Fraction  :  CentralME./* 

a-i i  :  (i  ^  1  A  ai(i  — i))  A  —i criti  — »  remove  /$; 

Fraction  :  RicartAgrawalaME./i 
a-i i  :  aij  (CentralME./i)  — »  add  fi\ 

Fig.  9.  Constrained  Adaptation  from  CentralME  to  RicartAgrawalaME 


6  Discussion 

In  this  section,  we  explain  some  of  the  issues  related  to  our  work. 

How  is  the  verification  of  the  program  before  adaptation  and  the  program  after 
adaptation  related  to  our  approach?  Why  is  it  necessary  to  verify  program  during 
adaptation? 

Since  the  program  and  the  specification  before  adaptation  is  fixed,  we  can  use 
existing  techniques,  such  as  model-checking  and  theorem-proving,  to  verify  the 
program  before  adaptation.  These  techniques  can  also  be  used  to  verify  the  pro¬ 
gram  after  adaptation,  as  the  program  and  specification  after  adaptation  is  also 


fixed,  though  different  from  the  old  program  and  specification.  However,  verifying 
the  old  and  new  program  is  not  enough,  as  the  specification  during  adaptation 
is  also  to  be  satisfied.  For  example,  as  discussed  in  Section  5,  while  replacing 
mutual  exclusion  protocol  we  still  want  to  satisfy  mutual  exclusion  property  dur¬ 
ing  adaptation.  Moreover,  since  the  program  during  adaptation  is  not  fixed,  it 
is  not  straightforward  to  use  the  existing  techniques  such  as  model-checking  and 
theorem-proving  for  verification.  As  shown  in  the  Section  3,  the  intermediate  pro¬ 
grams  and  the  transitional-invariants  need  to  identified  for  verifying  program  dur¬ 
ing  adaptation.  The  results  of  verification  of  the  old  and  the  new  program  are 
used  in  instantiating  the  start  node  and  the  end  node  in  the  lattice. 

How  does  enforcing  ordering  among  atomic  adaptations  affect  the  execution  of 
adaptation? 

Executing  the  atomic  adaptations  in  a  predefined  order  may  increase  the 
amount  of  time  it  takes  to  complete  the  adaptation  compared  to  the  case  where 
these  atomic  adaptations  can  be  executed  concurrently.  However,  since  all  atomic 
adaptations  need  to  be  executed  in  order  for  the  adaptation  to  complete,  a  delay 
in  any  atomic  adaptation  may  delay  the  entire  adaptation.  In  order  to  reduce  the 
verification  complexity  while  minimizing  the  execution  time  of  adaptation,  one 
can  enforce  the  ordering  only  on  a  subset  of  concurrent  atomic  adaptations. 

Can  the  verification  complexity  be  reduced  in  presence  of  concurrent  atomic  adap¬ 
tations? 

In  this  paper,  we  reduced  the  verification  complexity  by  introducing  constraints 
on  the  adaptation  to  enforce  some  ordering  among  concurrent  atomic  adaptations. 
In  case  where  the  concurrent  atomic  adaptations  are  similar,  the  different  adapta¬ 
tion  paths  in  the  adaptation  lattice  would  be  also  similar.  If  it  is  possible  to  verify 
a  subset  of  these  paths  and  imply  the  correctness  of  other  paths  based  on  the  sim¬ 
ilarity  among  atomic  adaptations,  then  this  approach  will  reduce  the  verification 
complexity  in  presence  of  concurrent  atomic  adaptations. 

7  Conclusion 

In  this  paper  we  discussed  an  approach  based  on  the  adaptation  lattice  to  ver¬ 
ify  the  dynamic  adaptation  and  the  tradeoff  between  concurrency  of  adaptation 
and  complexity  of  verifying  that  adaptation.  We  discussed  how  the  enforcing  of 
ordering  among  atomic  adaptations  can  reduce  the  complexity  of  verifying  adap¬ 
tation.  We  presented  our  results  using  the  case  study  of  dynamic  adaptation  from 
centralized  mutual  exclusion  protocol  to  a  distributed  mutual  exclusion  protocol. 

The  results  from  our  case  study  in  this  paper  suggests  that  by  setting  ap¬ 
propriate  constraints  on  adaptation,  we  can  reduce  the  complexity  of  verifying 
the  adaptation,  and  thereby,  making  it  easy  to  provide  assurance  guarantees  for 
adaptation. 

As  discussed  in  Section  6,  we  would  like  to  reduce  the  verification  complexity 
without  altering  the  concurrency  of  adaptation.  In  this  context,  we  would  like 
to  investigate  if  it  is  possible  to  provide  assurance  of  the  entire  lattice  by  veri¬ 
fying  only  a  subset  of  the  lattice.  Though  not  formally  established,  our  current 
results  suggest  that  this  may  be  possible  in  case  where  there  is  similarity  among 
adaptation  paths  in  the  lattice.  As  a  part  of  future  work,  we  want  to  use  some  of 


the  results  from  partial  order  reduction  in  model-checking  and  apply  them  in  the 
context  of  verifying  adaptation  using  adaptation  lattice. 
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